CEE 218X Final Project

COVID Cases and Testing in Alameda and Oakland

COVID Dataset from Alameda County COVID-19 Case/Case Rates by Zip Code GeoJSON API URL: https://opendata.arcgis.com/datasets/5d6bf4760af64db48b6d053e7569a47b_0.geojson

COVID Dataset from Alameda County COVID-19 Test Rates by Zip Code GeoJSON API URL: https://opendata.arcgis.com/datasets/5d6bf4760af64db48b6d053e7569a47b_4.geojson

Map of COVID Test Rates in Oakland

Map of COVID Case Rates in Oakland

Equity analysis of income and race in Oakland

Assumptions about the above graph:

Our first major assumption is that the racial breakdown shown in this graph is representative of Oakland's population. We also assume that the underlying data set is valid, complete, and accurate. The data set is gathered and published by the U.S. Census Bureau, so we treat it as a credible source and assume it was collected in a fair and unbiased way.

Examples of interesting plots and findings from our dashboard

Regression Modeling

When we examined the relationship between COVID case rates and COVID test rates, we noticed outliers that made the results less meaningful. We therefore set out to find which ZIP codes caused these outliers and to remove them so that the regression would be more accurate.
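The report does not say how the outlying ZIP codes were found, so the following is only a minimal sketch of one common approach: fit the regression and flag observations with a large Cook's distance. The data are synthetic, the `zip` column and its labels are hypothetical, and the 4/n cutoff is a rule of thumb that we assume here, so the result will not match the regression summaries in this report.

```r
# Synthetic stand-in data. 'TestRates' and 'CaseRates' mirror the column
# names in the model call; 'zip' and all values are made up for illustration.
set.seed(1)
zips <- data.frame(
  zip       = sprintf("Z%02d", 1:30),
  TestRates = runif(30, 1000, 5000)
)
zips$CaseRates <- 8000 - 0.9 * zips$TestRates + rnorm(30, sd = 500)

# Plant one extreme point (very high test rate and case rate).
zips$TestRates[30] <- 60000
zips$CaseRates[30] <- 20000

fit <- lm(CaseRates ~ TestRates, data = zips)

# Cook's distance measures how much each observation shifts the fitted model.
# The 4/n cutoff is a common rule of thumb (an assumption), not a formal test.
cooks   <- cooks.distance(fit)
flagged <- zips$zip[cooks > 4 / nrow(zips)]
print(flagged)
```

A scatter plot of `CaseRates` against `TestRates` with the flagged points labeled is a useful visual check before dropping anything.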

## 
## Call:
## lm(formula = CaseRates ~ TestRates, data = alameda_grouping_by_zip1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6030.2 -2471.6  -298.1  1952.1 11263.8 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 5.784e+03  5.741e+02  10.074    1e-13 ***
## TestRates   1.601e-01  5.631e-02   2.842  0.00643 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3796 on 51 degrees of freedom
## Multiple R-squared:  0.1367, Adjusted R-squared:  0.1198 
## F-statistic: 8.079 on 1 and 51 DF,  p-value: 0.006426

From this we can identify several ZIP codes whose data points deviate far from the rest of the data, specifically 94720, 95377, 94621, 94613, and 94603. We remove these ZIP codes from the regression to help improve it.
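The removal step described above can be sketched as follows. This is an illustration under stated assumptions, not the code used for this report: the `zip` column name and the data frame `demo` are hypothetical, and every rate in it is invented, so the printed slopes are unrelated to the report's results.

```r
# ZIP codes named in the text as outliers.
outlier_zips <- c("94720", "95377", "94621", "94613", "94603")

# Hypothetical stand-in for the per-ZIP table. The 'zip' column name is an
# assumption and all TestRates/CaseRates values are invented.
demo <- data.frame(
  zip       = c("94601", "94602", "94603", "94605", "94606",
                "94613", "94621", "94720"),
  TestRates = c(2100, 3400, 2500, 2800, 3000, 9000, 2600, 30000),
  CaseRates = c(7200, 5100, 14000, 6500, 6000, 1500, 15000, 12000)
)

# Keep only rows whose ZIP code is not in the outlier list, then refit.
kept     <- demo[!(demo$zip %in% outlier_zips), ]
fit_all  <- lm(CaseRates ~ TestRates, data = demo)
fit_kept <- lm(CaseRates ~ TestRates, data = kept)

# Compare the TestRates slope before and after removal.
slopes <- c(all  = unname(coef(fit_all)["TestRates"]),
            kept = unname(coef(fit_kept)["TestRates"]))
print(slopes)
```

Filtering into a new data frame, rather than overwriting the original, keeps both fits available for a before-and-after comparison.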

## 
## Call:
## lm(formula = CaseRates ~ TestRates, data = alameda_grouping_by_zip1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5390.5 -2214.7   -19.1  1527.7  7656.7 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 8665.7717  1187.5665   7.297 3.27e-09 ***
## TestRates     -0.9230     0.3856  -2.394   0.0208 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2794 on 46 degrees of freedom
## Multiple R-squared:  0.1108, Adjusted R-squared:  0.09143 
## F-statistic:  5.73 on 1 and 46 DF,  p-value: 0.02082

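As we can see, removing the five outlying ZIP codes changed the trend: the TestRates coefficient went from positive (0.160) to negative (-0.923). The fit remains weak in both cases (R-squared of 0.137 before and 0.111 after), so test rate alone explains little of the variation in case rate. With the outliers removed, we are now ready for analysis.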

## 
## Call:
## lm(formula = estimate ~ CaseRates, data = alameda_grouping_by_zip2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -425.59 -154.16  -43.57  104.49  834.73 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -15.69792   94.30960  -0.166    0.869    
## CaseRates     0.06805    0.01417   4.803  1.7e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 284.7 on 46 degrees of freedom
## Multiple R-squared:  0.334,  Adjusted R-squared:  0.3195 
## F-statistic: 23.07 on 1 and 46 DF,  p-value: 1.696e-05

Sources:
https://calmatters.org/health/coronavirus/2021/06/california-covid-inequality-oakland-rockridge/
https://www.unitedstateszipcodes.org/
https://www.accfb.org/how-covid-19-is-affecting-communities-of-color/
https://www.census.gov/quickfacts/oaklandcitycalifornia
https://www.oaklandca.gov/news/2020/local-leaders-announce-covid-19-racial-disparities-task-force